Intelligent Paper Notes

The Open corpus of the Veps and Karelian languages: overview and applications

Tatyana Boyko, Nina Zaitseva, Natalia Krizhanovskaya, Andrew Krizhanovsky, Irina Novak, Nataliya Pellinen, Aleksandra Rodionova

Category: Natural Language Processing

2022-06-08

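Research on the Baltic-Finnic languages of the Republic of Karelia increasingly prioritizes the methods and tools of corpus linguistics. Since 2016, linguists, mathematicians, and programmers at the Karelian Research Centre have been working on the Open Corpus of Veps and Karelian Languages (VepKar), an extension of the Veps Corpus created in 2009. The VepKar corpus comprises texts in Karelian and Veps, multifunctional dictionaries linked to them, and software with an advanced search system that uses various text criteria (language, genre, etc.) and numerous linguistic categories (lexical and grammatical search in texts was implemented thanks to the word-form generator we created earlier). A corpus of 3000 texts was compiled, the texts were uploaded and marked up, a system for classifying texts by language, dialect, type, and genre was introduced, and the word-form generator was created. Future plans include developing a speech module for working with audio recordings and a syntactic tagging module that uses the output of morphological analysis. Owing to continuous functional improvements to the corpus manager and the ongoing enrichment of VepKar with new material and text markup, users can handle a wide range of scientific and applied tasks. In creating the national VepKar corpus, its developers and managers strive to preserve and present the state of the Veps and Karelian languages in the 19th to 21st centuries.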


UniMorph 4.0: Universal Morphology

Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate

Category: Natural Language Processing

2022-05-07

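The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage, instantiated, normalized morphological inflection tables for hundreds of the world's languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation, and a type-level resource of annotated data in diverse languages realizing that schema. This paper presents the expansions and improvements made on several fronts over the last few years (since McCarthy et al. (2020)). Collaborative efforts by numerous linguists have added 67 new languages, including 30 endangered languages. We have made several improvements to the extraction pipeline to tackle issues such as missing gender and macron information. We have also amended the schema to use the hierarchical structure needed for morphological phenomena such as multiple-argument agreement and case stacking, while adding some missing morphological features to make the schema more inclusive. Relative to the last UniMorph release, we have also augmented the database with morpheme segmentation for 16 languages. Lastly, this new release makes a push towards including derivational morphology in UniMorph by enriching the data and annotation schema with instances representing derivational processes from MorphyNet.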


Towards Holistic Surgical Scene Understanding

Natalia Valderrama, Paola Ruiz Puentes, Isabela Hernández, Nicolás Ayobi, Mathilde Verlyk, Jessica Santander, Juan Caicedo, Nicolás Fernández, Pablo Arbeláez

Category: Computer Vision | Artificial Intelligence

2022-12-08

Most benchmarks for studying surgical interventions focus on a specific challenge instead of leveraging the intrinsic complementarity among different tasks. In this work, we present a new experimental framework towards holistic surgical scene understanding. First, we introduce the Phase, Step, Instrument, and Atomic Visual Action recognition (PSI-AVA) Dataset. PSI-AVA includes annotations for both long-term (Phase and Step recognition) and short-term reasoning (Instrument detection and novel Atomic Action recognition) in robot-assisted radical prostatectomy videos. Second, we present Transformers for Action, Phase, Instrument, and steps Recognition (TAPIR) as a strong baseline for surgical scene understanding. TAPIR leverages our dataset's multi-level annotations as it benefits from the learned representation on the instrument detection task to improve its classification capacity. Our experimental results in both PSI-AVA and other publicly available databases demonstrate the adequacy of our framework to spur future research on holistic surgical scene understanding.


Self-Supervised Correspondence Estimation via Multiview Registration

Mohamed El Banani, Ignacio Rocco, David Novotny, Andrea Vedaldi, Natalia Neverova, Justin Johnson, Benjamin Graham

Category: Computer Vision

2022-12-06

Video provides us with the spatio-temporal consistency needed for visual learning. Recent approaches have utilized this signal to learn correspondence estimation from close-by frame pairs. However, by relying only on close-by frame pairs, those approaches miss out on the richer long-range consistency between distant overlapping frames. To address this, we propose a self-supervised approach for correspondence estimation that learns from multiview consistency in short RGB-D video sequences. Our approach combines pairwise correspondence estimation and registration with a novel SE(3) transformation synchronization algorithm. Our key insight is that self-supervised multiview registration allows us to obtain correspondences over longer time frames, increasing both the diversity and difficulty of sampled pairs. We evaluate our approach on indoor scenes for correspondence estimation and RGB-D point cloud registration and find that we perform on par with supervised approaches.


Towards a more efficient computation of individual attribute and policy contribution for post-hoc explanation of cooperative multi-agent systems using Myerson values

Giorgio Angelotti, Natalia Díaz-Rodríguez

Category: Artificial Intelligence | Machine Learning

2022-12-06

A quantitative assessment of the global importance of an agent in a team is as valuable as gold for strategists, decision-makers, and sports coaches. Yet retrieving this information is not trivial, since in a cooperative task it is hard to isolate the performance of an individual from that of the whole team. Moreover, the relationship between the role of an agent and its personal attributes is not always clear. In this work we conceive an application of Shapley analysis for studying the contribution of both agent policies and attributes, putting them on equal footing. Since computing Shapley values is NP-hard and scales exponentially with the number of participants in a transferable utility coalitional game, we resort to exploiting a priori knowledge about the rules of the game to constrain the relations between the participants over a graph. We hence propose a method to determine a Hierarchical Knowledge Graph of agents' policies and features in a Multi-Agent System. Assuming a simulator of the system is available, the graph structure allows dynamic programming to be exploited to assess the importances much faster. We test the proposed approach in a proof-of-case environment deploying both hardcoded policies and policies obtained via Deep Reinforcement Learning. The proposed paradigm is less computationally demanding than naively computing the Shapley values and provides great insight not only into the importance of an agent in a team but also into the attributes needed to deploy the policy at its best.
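
To illustrate why the abstract calls the naive computation expensive, here is a minimal sketch of exact Shapley values obtained by averaging marginal contributions over every ordering of the players. It is plain Shapley enumeration, not the graph-constrained (Myerson) method the paper proposes; the function names and the toy game are hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over all n! orderings (this is the cost that grows prohibitively)."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical game: the team scores 1 only if "a" plays
    together with at least one of "b" or "c"."""
    return 1.0 if "a" in coalition and ({"b", "c"} & coalition) else 0.0


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)  # "a" receives 2/3, "b" and "c" receive 1/6 each
```

With three players the loop visits 6 orderings; with 15 players it would visit more than 10^12, which is why restricting the admissible coalitions through prior knowledge of the game is attractive.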


MobileTL: On-device Transfer Learning with Inverted Residual Blocks

Hung-Yueh Chiang, Natalia Frumkin, Feng Liang, Diana Marculescu

Category: Machine Learning | Artificial Intelligence

2022-12-05

Transfer learning on edge devices is challenging due to limited on-device resources. Existing work addresses this issue by training a subset of parameters or adding model patches. Developed with inference in mind, Inverted Residual Blocks (IRBs) split a convolutional layer into depthwise and pointwise convolutions, leading to more stacked layers, e.g., convolution, normalization, and activation layers. Though they are efficient for inference, IRBs require that additional activation maps be stored in memory to train the weights of convolution layers and the scales of normalization layers. As a result, their high memory cost prohibits training IRBs on resource-limited edge devices, making them unsuitable in the context of transfer learning. To address this issue, we present MobileTL, a memory- and computation-efficient on-device transfer learning method for models built with IRBs. MobileTL trains the shifts for internal normalization layers to avoid storing activation maps for the backward pass. Also, MobileTL approximates the backward computation of the activation layer (e.g., Hard-Swish and ReLU6) as a signed function, which enables storing a binary mask instead of activation maps for the backward pass. MobileTL fine-tunes a few top blocks (close to the output) rather than propagating the gradient through the whole network, to reduce the computation cost. Our method reduces memory usage by 46% and 53% for MobileNetV2 and V3 IRBs, respectively. For MobileNetV3, we observe a 36% reduction in floating-point operations (FLOPs) when fine-tuning 5 blocks, while incurring only a 0.6% accuracy reduction on CIFAR10. Extensive experiments on multiple datasets demonstrate that our method is Pareto-optimal (best accuracy under given hardware constraints) compared to prior work in transfer learning for edge devices.
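
The binary-mask idea can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions, not the authors' implementation: the step threshold at zero, the function names, and the list-based tensors are all hypothetical. The exact Hard-Swish gradient depends on the input value, so the input must be kept for the backward pass; a step-function approximation needs only one bit per element.

```python
def hard_swish(x):
    """Hard-Swish: x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def forward(xs):
    """Forward pass that saves a 1-bit mask instead of the full input.
    Assumption: the gradient is approximated as 1 where x > 0, else 0."""
    ys = [hard_swish(x) for x in xs]
    mask = [x > 0.0 for x in xs]
    return ys, mask


def backward_approx(grad_out, mask):
    """Approximate backward pass: route the gradient through the mask."""
    return [g if m else 0.0 for g, m in zip(grad_out, mask)]


def backward_exact(grad_out, xs):
    """Exact backward pass, shown for comparison; it needs the inputs xs."""
    def d_hard_swish(x):
        if x <= -3.0:
            return 0.0
        if x >= 3.0:
            return 1.0
        return (2.0 * x + 3.0) / 6.0
    return [g * d_hard_swish(x) for g, x in zip(grad_out, xs)]


xs = [-4.0, -1.0, 2.0, 5.0]
ys, mask = forward(xs)
grad_in = backward_approx([1.0, 1.0, 1.0, 1.0], mask)
```

In this sketch `mask` holds booleans where `backward_exact` would need the floating-point inputs, which is the source of the memory saving the abstract describes.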


Common Pets in 3D: Dynamic New-View Synthesis of Real-Life Deformable Categories

Samarth Sinha, Roman Shapovalov, Jeremy Reizenstein, Ignacio Rocco, Natalia Neverova, Andrea Vedaldi, David Novotny

Category: Computer Vision

2022-11-07

Obtaining photorealistic reconstructions of objects from sparse views is inherently ambiguous and can only be achieved by learning suitable reconstruction priors. Earlier works on sparse rigid object reconstruction successfully learned such priors from large datasets such as CO3D. In this paper, we extend this approach to dynamic objects. We use cats and dogs as a representative example and introduce Common Pets in 3D (CoP3D), a collection of crowd-sourced videos showing around 4,200 distinct pets. CoP3D is one of the first large-scale datasets for benchmarking non-rigid 3D reconstruction "in the wild". We also propose Tracker-NeRF, a method for learning 4D reconstruction from our dataset. At test time, given a small number of video frames of an unseen object, Tracker-NeRF predicts the trajectories of its 3D points and generates new views, interpolating viewpoint and time. Results on CoP3D reveal significantly better non-rigid new-view synthesis performance than existing baselines.


Greybox XAI: a Neural-Symbolic learning framework to produce interpretable predictions for image classification

Adrien Bennetot, Gianni Franchi, Javier Del Ser, Raja Chatila, Natalia Diaz-Rodriguez

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-09-26

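Although Deep Neural Networks (DNNs) have great generalization and prediction capabilities, their functioning does not allow a detailed explanation of their behavior. Opaque deep learning models are increasingly used to make important predictions in critical environments, and the danger is that they make and use predictions that cannot be justified or legitimized. Several eXplainable Artificial Intelligence (XAI) methods that separate explanations from machine learning models have emerged, but they fall short in faithfulness to the model's actual functioning and in robustness. As a result, there is widespread agreement on the importance of endowing deep learning models with explanatory capabilities, so that they can themselves answer why a particular prediction was made. First, we address the lack of universal criteria for XAI by formalizing what an explanation is. We also introduce a set of axioms and definitions to clarify XAI from a mathematical perspective. Finally, we present Greybox XAI, a framework that composes a DNN and a transparent model through the use of a symbolic Knowledge Base (KB). We extract a KB from the dataset and use it to train a transparent model (i.e., a logistic regression). An encoder-decoder architecture is trained on RGB images to produce an output similar to the KB used by the transparent model. Once the two models are trained independently, they are used compositionally to form an explainable predictive model. We show that this new architecture is accurate and explainable on several datasets.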


Transformer-based classification of premise in tweets related to COVID-19

Vadim Porvatov, Natalia Semenova

Category: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-09-08

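Automating the assessment of social network data is one of the classic challenges of natural language processing. During the COVID-19 pandemic, mining people's stances from public messages became crucial for understanding attitudes towards health orders. In this paper, the authors propose a predictive model based on the transformer architecture to classify the presence of a premise in Twitter texts. This work was completed as part of the Social Media Mining for Health (SMM4H) Workshop 2022. We explored modern transformer-based classifiers in order to construct a pipeline that efficiently captures tweet semantics. Our experiments on a Twitter dataset showed that RoBERTa is superior to the other transformer models on the premise prediction task. The model achieved competitive performance, with a ROC AUC of 0.807 and an F1 score of 0.7648.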


Automatic Ultrasound Image Segmentation of Supraclavicular Nerve Using Dilated U-Net Deep Learning Architecture

Mizuki Miyatake, Subhash Nerella, David Simpson, Natalia Pawlowicz, Sarah Stern, Patrick Tighe, Parisa Rashidi

Category: Computer Vision | Machine Learning

2022-08-09

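Automated object recognition in medical images can facilitate medical diagnosis and treatment. In this paper, we automatically segmented supraclavicular nerves in ultrasound images to assist in injecting peripheral nerve blocks. Nerve blocks are generally used for pain treatment after surgery, where ultrasound guidance is used to inject local anesthetics next to target nerves. This treatment blocks the transmission of pain signals to the brain, which can help improve the rate of recovery from surgery and significantly decrease the need for postoperative opioids. However, Ultrasound-Guided Regional Anesthesia (UGRA) requires anesthesiologists to visually recognize the actual nerve position in the ultrasound images. This is a complex task given the varied visual presentation of nerves in ultrasound images and their visual similarity to many neighboring tissues. In this study, we used an automated nerve detection system for UGRA nerve block treatment. The system can recognize the position of the nerve in ultrasound images using deep learning techniques. We developed a model to capture the features of nerves by training two deep neural networks with skip connections: two extended U-Net architectures, with and without dilated convolutions. This solution could potentially lead to an improved blockade of targeted nerves in regional anesthesia.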
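
For readers unfamiliar with the dilated convolutions mentioned in the abstract, the following one-dimensional sketch shows the core idea: the kernel taps are spaced `dilation` samples apart, so the receptive field grows without adding weights. It is a toy illustration in plain Python, not the paper's 2D U-Net layers; the function name and the sample data are hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1D cross-correlation with a dilated kernel.
    One output sample covers dilation * (len(kernel) - 1) + 1 inputs."""
    span = dilation * (len(kernel) - 1) + 1
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

dense = dilated_conv1d(signal, kernel, dilation=1)    # each output sees 3 inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)  # each output sees 5 inputs
print(dense)    # [6, 9, 12, 15, 18]
print(dilated)  # [9, 12, 15]
```

Both calls use the same three weights; only the spacing between the taps changes, which is why dilation is a cheap way to give a segmentation network more context per output pixel.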
